We develop a quantitative theory of the escape problem for stochastic gradient descent (SGD) and investigate the effect of loss-surface sharpness on escape. Deep learning has achieved great success in various fields; however, it has also opened up a variety of theoretical open problems. One typical question is why SGD can find parameters that generalize well despite the non-convexity of the loss. The escape problem is one approach to this question: it investigates how SGD escapes from local minima. In this paper, by applying the theory of stochastic dynamical systems, we develop a quasi-potential theory of the escape problem. We show that the quasi-potential handles the geometric properties of the loss surface and the covariance structure of the gradient noise in a unified manner, whereas previous works studied them separately. Our theoretical results imply that (i) the sharpness of the loss surface contributes to SGD's slow escape, and (ii) SGD's noise structure cancels this effect and exponentially accelerates escape. We also validate our theory empirically with experiments on neural networks trained on real data.
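For orientation, the following is the textbook Freidlin–Wentzell quasi-potential for an SGD-as-SDE model; the notation below is the standard large-deviations one and is our assumption, not necessarily the paper's exact definitions.

```latex
% Standard sketch (our notation, not necessarily the paper's): model SGD as the SDE
%   d\theta_t = -\nabla L(\theta_t)\,dt + \sqrt{\varepsilon}\,\sigma(\theta_t)\,dW_t,
% with gradient-noise covariance \Sigma = \sigma\sigma^{\top} and small parameter
% \varepsilon (proportional to learning rate over batch size).
\[
  V_Q(x) \;=\; \inf_{\phi(0)=\theta^{*},\ \phi(T)=x}\;
  \frac{1}{2}\int_0^T
  \bigl(\dot{\phi}_t + \nabla L(\phi_t)\bigr)^{\top}
  \Sigma(\phi_t)^{-1}
  \bigl(\dot{\phi}_t + \nabla L(\phi_t)\bigr)\,dt,
  \qquad
  \mathbb{E}[\tau_{\mathrm{esc}}] \asymp e^{V_Q^{\dagger}/\varepsilon},
\]
% where \theta^{*} is the local minimum and V_Q^{\dagger} is the quasi-potential
% barrier on the basin boundary. Sharpness enters through \nabla L and the noise
% structure through \Sigma^{-1}, which is how the two effects combine in one object.
```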
This study proposes novel control methods that lower impact force through preemptive movement and transition smoothly to conventional contact impedance control. The proposed techniques target force-control-based robots and position/velocity-control-based robots, respectively. Large impact forces adversely affect many robotic tasks. Recently, preemptive impact-reduction techniques that extend conventional contact impedance control using proximity sensors have been studied. However, a seamless transition from impact reduction to contact impedance control has not yet been achieved. The proposed methods use a serially combined impedance control framework to solve this problem. Because the parameter design is separated into impact reduction and contact impedance control, the preemptive impact-reduction feature can be added to an already implemented impedance controller, and no undesirable contact force arises during the transition. Furthermore, even though the preemptive impact reduction employs a crude optical proximity sensor, the influence of reflectance is minimized by a virtual viscous force. Analyses and real-world experiments confirm these benefits.
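To make the serial combination concrete, here is a minimal 1-DOF sketch of the idea as we read it from the abstract; the controller structure, gains, thresholds, and function names are illustrative assumptions, not the paper's implementation.

```python
# Minimal 1-DOF sketch of a serially combined impedance controller (illustrative
# assumptions throughout; gains and names are placeholders, not the paper's).

def contact_impedance_force(x, x_d, v, k=400.0, d=40.0):
    """Conventional contact impedance: spring-damper toward the desired pose x_d."""
    return -k * (x - x_d) - d * v

def preemptive_viscous_force(v, proximity, d_virtual=60.0, d_act=0.05):
    """Virtual viscous force that ramps up as the proximity reading shrinks,
    decelerating the end effector before contact. A velocity-dependent (viscous)
    term is less sensitive to reflectance-distorted readings from a crude
    optical proximity sensor than a position-dependent one."""
    if proximity >= d_act:                            # outside activation range
        return 0.0
    gain = d_virtual * (1.0 - proximity / d_act)      # grows smoothly toward contact
    return -gain * v

def combined_command(x, x_d, v, proximity):
    """Serial combination: the two terms are designed independently, so the
    impact-reduction term fades in and out without a force discontinuity at
    the switch to pure contact impedance control."""
    return contact_impedance_force(x, x_d, v) + preemptive_viscous_force(v, proximity)
```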
Humans demonstrate a variety of interesting behavioral characteristics when performing tasks, such as selecting among seemingly equivalent optimal actions, performing recovery actions when deviating from the optimal trajectory, or moderating actions in response to sensed risks. However, imitation learning, which attempts to teach robots to perform these same tasks from observations of human demonstrations, often fails to capture such behavior. Specifically, commonly used learning algorithms embody inherent contradictions between their learning assumptions (e.g., a single optimal action) and actual human behavior (e.g., multiple optimal actions), thereby limiting robot generalizability, applicability, and demonstration feasibility. To address this, this paper proposes designing imitation learning algorithms with a focus on utilizing human behavioral characteristics, thereby embodying principles for capturing and exploiting actual demonstrator behavior. This paper presents the first imitation learning framework, Bayesian Disturbance Injection (BDI), that typifies human behavioral characteristics by incorporating model flexibility, robustification, and risk sensitivity. Bayesian inference is used to learn flexible non-parametric multi-action policies, while simultaneously robustifying policies by injecting risk-sensitive disturbances to induce human recovery actions and ensure demonstration feasibility. Our method is evaluated through risk-sensitive simulations and real-robot experiments (e.g., table-sweep, shaft-reach, and shaft-insertion tasks) using the UR5e 6-DOF robotic arm, demonstrating improved characterization of behavior. Results show significant improvements in task performance through improved flexibility, robustness, and demonstration feasibility.
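A hypothetical sketch of the risk-sensitive disturbance-injection loop as we read it from the abstract, not the authors' code; the environment interface, risk_scale parameter, and helper names below are placeholders.

```python
import numpy as np

# Sketch (our reading, not the authors' code): perturbing the commanded action
# during data collection makes the expert's next correction a recovery
# demonstration, enriching the dataset the policy is later fitted on.

def inject_disturbance(action, risk_scale, rng):
    return action + rng.normal(0.0, risk_scale, size=action.shape)

def collect_robustified_demos(expert_policy, env, episodes, risk_scale=0.05, seed=0):
    """env is a placeholder with reset() -> state and step(a) -> (state, done)."""
    rng = np.random.default_rng(seed)
    states, actions = [], []
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            a = expert_policy(s)                 # expert's intended action
            states.append(s)
            actions.append(a)                    # record the *intended* action
            s, done = env.step(inject_disturbance(a, risk_scale, rng))
    return np.asarray(states), np.asarray(actions)
```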
Text-to-image models have recently achieved remarkable success, producing seemingly accurate samples of photo-realistic quality. However, just as state-of-the-art language models still struggle to evaluate precise statements, so do image-generation pipelines built on them. In this work, we demonstrate shortcomings of state-of-the-art text-to-image models such as DALL-E in generating accurate samples for statements related to the DrawBench benchmark. Furthermore, we show that CLIP cannot consistently re-rank these samples. To this end, we propose LogicRank, a neuro-symbolic reasoning framework that provides a more accurate ranking system for such precision-demanding settings. LogicRank integrates smoothly into the generation process of text-to-image models and, moreover, can be used to further fine-tune toward more logically precise models.
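A hedged sketch of what a neuro-symbolic re-ranking loop of this kind could look like; generate, clip_score, and satisfies_constraints are hypothetical placeholders standing in for the text-to-image model, the CLIP scorer, and a symbolic checker, and the lexicographic ranking rule is our assumption, not LogicRank's published algorithm.

```python
# Hedged sketch in the spirit of LogicRank (placeholder interfaces throughout).

def logic_rank(prompt, constraints, generate, clip_score, satisfies_constraints, n=32):
    candidates = [generate(prompt) for _ in range(n)]
    scored = []
    for img in candidates:
        sat = satisfies_constraints(img, constraints)  # symbolic check, e.g. object counts
        neural = clip_score(prompt, img)               # neural text-image similarity
        scored.append(((sat, neural), img))
    # True sorts above False, so constraint-satisfying samples always rank first,
    # with ties broken by the neural score.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [img for _, img in scored]
```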
Membership inference attacks (MIAs) pose a privacy risk to the training data of machine learning models. With an MIA, an attacker guesses whether a target record is a member of the training dataset. The state-of-the-art defense against MIAs, distillation for membership privacy (DMP), requires not only the private data to be protected but also a large amount of unlabeled public data. However, in some privacy-sensitive domains, such as medicine and finance, the availability of public data is not guaranteed. Moreover, the trivial approach of generating public data with a generative adversarial network significantly degrades model accuracy, as reported by the authors of DMP. To overcome this problem, we propose a novel defense against MIAs that uses knowledge distillation without requiring public data. Our experiments show that the privacy protection and accuracy of our defense are comparable to those of DMP on the benchmark tabular datasets used in MIA research, and that our defense achieves a better privacy-utility trade-off than existing defenses that do not use public data on the image dataset CIFAR10.
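For orientation, the following is a generic knowledge-distillation loss in PyTorch, the building block the defense is described as using; this is standard KD, not the paper's specific training recipe.

```python
import torch.nn.functional as F

# Generic KD loss: the student mimics temperature-softened teacher outputs,
# which flattens per-example confidence and thereby weakens the membership
# signal that MIAs exploit.

def distillation_loss(student_logits, teacher_logits, T=4.0):
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```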
Scenarios requiring humans to choose from multiple seemingly optimal actions are commonplace; however, standard imitation learning often fails to capture this behavior. Instead, an over-reliance on replicating expert actions induces inflexible and unstable policies, leading to poor generalizability in application. To address this problem, this paper presents the first imitation learning framework that incorporates Bayesian variational inference for learning flexible non-parametric multi-action policies, while simultaneously robustifying the policies against sources of error by introducing and optimizing disturbances to create a richer demonstration dataset. This combined approach forces the policy to adapt to challenging situations, enabling stable multi-action policies to be learned efficiently. The effectiveness of the proposed method is evaluated through simulations and real-robot experiments on a table-sweep task using the UR3 6-DOF robotic arm. Results show that, through improved flexibility and robustness, both learning performance and control safety are better than those of comparison methods.
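Since the preceding BDI abstract already sketches the disturbance-injection side, here is a sketch of the other ingredient, a variationally fitted multi-modal policy; using scikit-learn's BayesianGaussianMixture over joint (state, action) data is our illustrative stand-in, not the paper's non-parametric model.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Illustrative stand-in (not the paper's model): fit a Bayesian mixture over joint
# (state, action) vectors so the induced conditional p(action | state) can retain
# several modes instead of averaging seemingly equivalent optimal actions together.

def fit_multiaction_policy(states, actions, max_modes=8, seed=0):
    data = np.hstack([states, actions])
    # Variational inference with a sparse weight prior prunes unused components,
    # so max_modes is only an upper bound on the number of action modes.
    return BayesianGaussianMixture(
        n_components=max_modes,
        weight_concentration_prior=1e-2,
        covariance_type="full",
        random_state=seed,
    ).fit(data)
```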